DETAILED ACTION

1. The text of those sections of Title 35, U.S. Code not included in this action can be found
in a prior Office action.

Response to Amendments 

2. This communication is responsive to the applicant's amendment dated 03/21/2005. 
Applicant amended claims 21-23 and cancelled claim 24 (see pages 11-13). 

Response to Arguments 

3. Applicant's arguments filed 03/21/2005, with respect to claims 1-12 and 14-23, have been
fully considered but they are not persuasive. In order to reflect the applicant's amendments, the
claim rejection is modified; see below.

4. It is noted that the applicant cites several cases regarding the prima facie case of
obviousness (page 15, last paragraph to page 16, paragraph 2), without specific argument
regarding this issue, particularly for the combination of the prior art of Sharman and Hata. Thus,
as a general response, applicant is directed to the related claim rejection.



5. In response to applicant's arguments against the references individually (regarding claims
1, 4-6 and 9-12, see amendment: page 16, last paragraph to page 18, last paragraph), one cannot
show nonobviousness by attacking references individually where the rejections are based on
combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re
Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).

It is noted that, as stated in the rejection, Sharman teaches looking up a word in a dictionary,
removing a possible prefix or suffix, concatenating diphone speech samples (small units) together,
and producing different output units (token, word, phoneme, syllable, which suggests the capability
of using larger pronunciation units); Hata teaches dictionary entries using sampled sounds
associated with the corresponding word and, in particular, teaches that whether to store words or
phonemes (smaller units) is a system design issue (motivation to use a larger sample unit).
Therefore, it would have been obvious to one of ordinary skill in the art at the time the invention
was made to combine Sharman and Hata to provide a stored speech sample in a word or other larger
unit (the removed prefix and/or suffix may be good candidate units, since they must be associated
with some pronunciation unit for output anyway). Further, the examiner disagrees with applicant's
argument that the third reference (a normal dictionary), showing a prefix or suffix as the same
kind of entry as a word, is "simply irrelevant, and does not provide any motivation, suggestion. . ."
It is noted that most dictionaries have prefix and suffix entries, just like word entries, and the
references (Sharman and Hata) both utilize a TTS dictionary for looking up words, so that it would
have been obvious to one of ordinary skill in the art to provide a TTS dictionary with prefix and/or
suffix entries, in the same way as in a normal dictionary.

Regarding other independent and dependent claims (the amendment: pages 19-20), the
response is based on the same reason described above, since they include the same or similar
argued limitation.

For the above reasons, the applicant's arguments are not persuasive, and the rejection is
sustained.

Claim Rejections - 35 USC § 103 
6. Claims 1, 4-6 and 9-12 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Sharman (US 5,774,854) in view of Hata et al. (US 5,878,393), hereinafter referenced as Hata,
and "new riverside university dictionary", hereinafter referenced as DIC.

Regarding claim 1, Sharman discloses a text to speech system, comprising:
"receiving a list of textual units, wherein said textual units in the list comprise words,
prefixes and suffixes", (column 2, lines 1-2, 'a linguistic processor for generating a listing (list)
of speech segments (textual units) . . . from the input (received) text'; column 5, lines 18-27,
'removing (separate) any possible prefix or suffix, to see if the word is related to one that is
already in the dictionary');

"for each textual unit in the list, locating an associated speech sample in memory, said
memory comprising vocabulary of words, prefixes and suffixes and a plurality of speech
samples" (column 5, lines 'using a dictionary look-up (necessarily stored in memory)', 'breaking
words down into syllables' and 'removing any possible prefix and suffix (also necessarily stored
in a buffer or memory in order to produce output speech)'; column 6, lines 25-67, 'many samples
(associated speech sample) of each diphone are collected', 'the relevant diphone are then retrieved
(located) from the diphone library (necessarily stored in a memory) and concatenated together by
the diphone concatenation unit 415 (PSOLA)'; Table 1 and column 7, lines 54-67, 'the output
buffer (memory) is used when a component produces several output units for each input unit
that receives', including 'token', 'word', 'phoneme', 'syllable', see column 7, table 1); and

"appending said associated speech sample to an output signal", (column 6, lines 25-67,
'concatenated together (appending) by the diphone (associated sample) concatenation unit 415
(PSOLA)', and 'produces the acoustic waveform (output signal)').

Even though Sharman discloses using a dictionary look-up for words, breaking words down
into syllables and removing a possible prefix and suffix as stated above, Sharman does not
expressly disclose "each speech sample corresponding to a one of said words, ... in said
vocabulary". However, this feature is well known in the art as evidenced by Hata, who discloses a
high quality concatenative reading system (title), and teaches "a dictionary of sampled sounds
recorded and stored in advance", "the dictionary (including vocabulary) entries can be
individual words", "the dictionary of samples may store more elemental speech components,
such as individual phonemes", which means the system is capable of using larger units (such as
a word) or smaller units (such as a phoneme) of speech sample, and that "whether to store entire
words or individual phonemes is largely a system design issue" (column 3, lines 42-65). Hata
also discloses data structure(s) including arrays, such as the dictionary, word list and phonologic
feature table (see Fig. 3). Therefore, it would have been obvious to one of ordinary skill in the
art at the time the invention was made to modify Sharman for specifically providing a stored
speech sample corresponding to a word or other unit, as taught by Hata, for the purpose of
selecting the appropriate "granularity" or dictionary entry size to suit the specific application
(Hata: column 3, lines 66-67).

Even though Sharman in view of Hata discloses a stored speech sample corresponding to
a word (or other unit) in a vocabulary (dictionary), as stated above, Sharman does not expressly
disclose "each speech sample corresponding to a one of said prefixes and suffixes in said
vocabulary". However, the feature of a prefix or suffix being an entry treated in the same way as a
word in a dictionary is well known in the art as evidenced by DIC, which teaches that prefixes and
suffixes can be individually treated as entries, just like word entries in a dictionary (see DIC:
entries "a-" and "-ability"). Therefore, it would have been obvious to one of ordinary skill in the
art at the time the invention was made to modify Sharman in view of Hata for specifically
providing a mechanism to treat a prefix or suffix in the same way as a word in a dictionary, as
taught by DIC, so that each stored speech sample can correspond to one of a word, prefix and
suffix, for the purpose of selecting the appropriate "granularity" or dictionary entry size to suit
the specific application (Hata: column 3, lines 66-67).
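The three steps recited in claim 1 (receiving a list of textual units, locating the associated
speech sample for each unit, and appending it to an output signal) can be illustrated by the
following minimal sketch. It is offered only as an aid to reading the claim language. The
vocabulary contents, the sample values and the names VOCABULARY and synthesize are hypothetical
and are not taken from Sharman, Hata, DIC or the application.

```python
from typing import Dict, List

# Hypothetical vocabulary: each word, prefix ("un-") or suffix ("-able") entry
# maps to a stored speech sample, represented here as a short list of PCM values.
VOCABULARY: Dict[str, List[int]] = {
    "un-": [1, 2],
    "read": [3, 4, 5],
    "-able": [6, 7],
}


def synthesize(textual_units: List[str], vocabulary: Dict[str, List[int]]) -> List[int]:
    """Locate the sample for each textual unit and append it to the output signal."""
    output_signal: List[int] = []
    for unit in textual_units:
        sample = vocabulary.get(unit)
        if sample is None:
            # Assumption: units without a stored sample are simply skipped here.
            continue
        output_signal.extend(sample)
    return output_signal
```

In this sketch a prefix or suffix is keyed in the same table as a word, which mirrors the
dictionary-entry treatment discussed above.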

Regarding claim 4 (depending on claim 1), Sharman further discloses that processing
input text at the substring level is based on a syllabified word (Sharman: column 5, line 31), so
that, combining the prior art features as applied above, the combined system satisfies all of the
claimed limitations: "for each textual unit in said consecutive plurality of said textual units,
locating an associated speech sample in said memory; creating a speech unit by splicing together
said plurality of associated speech samples; and appending said speech unit to said output
signal."

Regarding claim 5 (depending on claim 4), Sharman further discloses components of
identifying diphones 410 (Fig. 4), diphone library 420 and diphone concatenation 415 for
overcoming audible discontinuities (column 6, lines 34-40), which corresponds to the claimed
"after said splicing, processing said speech unit to remove discontinuities."

Regarding claim 6, Sharman discloses a text to speech system, by using a linguistic
processor for various linguistic processes (Figs. 2-3), comprising:

"receiving a text file", (column 2, lines 2-3, 'input text'; column 5, lines 1-2, 'obtain input
from a source, such as ... a stored file');

"parsing said text file into textual units, where each said parsed textual unit is one of a
word, a prefix and a suffix", (column 5, lines 3-40, 'split input text into tokens (words)',
implement special rules 'to map lexical items into canonical word form', 'using a dictionary
look-up', 'remove any possible prefix or suffix'); and

"for each one of said parsed textual units, if said one of said parsed textual units
corresponds to a stored textual unit in a vocabulary of textual units, and adding said stored
textual unit to a list", (column 2, lines 1-2, 'generating a listing (list) of speech segments
(equivalent textual units) ... from the input text', herein the list is inherently stored in a buffer;
column 5, lines 26-27 'to see if the word is related to one that is already in the dictionary';
column 6, lines 61-66 and column 7, Table 1, 'output unit represents the size of the text unit
(including word, phoneme)' used for different process stages; column 7, lines 45-66, 'output
buffer is also used when a component produces several output units for each input unit that it
receives', herein inherently including adding prefix and suffix to the buffer because without
storing them in the buffer the system cannot output required speech).

Even though Sharman discloses using a dictionary look-up for words, breaking words down
into syllables and removing a possible prefix and suffix as stated above, Sharman does not
expressly disclose "wherein said vocabulary of textual unit comprises words . . . each having a
pre-recorded speech sample associated therewith". However, this feature is well known in the
art as evidenced by Hata, who discloses a high quality concatenative reading system (title), and
teaches "a dictionary of sampled sounds (speech sample) recorded and stored in advance
(pre-recorded)", "the dictionary (including vocabulary) entries can be individual words", "the
dictionary of samples may store more elemental speech components, such as individual
phonemes", which means the system is capable of using larger units (such as a word) or smaller
units (such as a phoneme) of speech sample, and that "whether to store entire words or individual
phonemes is largely a system design issue" (column 3, lines 42-65). Hata also discloses data
structure(s) including arrays, such as the dictionary, word list and phonologic feature table (see
Fig. 3). Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Sharman for specifically providing a stored speech sample
corresponding to a word or other unit, as taught by Hata, for the purpose of selecting the
appropriate "granularity" or dictionary entry size to suit the specific application (Hata: column
3, lines 66-67).

Even though Sharman in view of Hata discloses a stored speech sample corresponding to
a word (or other unit) in a vocabulary (dictionary), as stated above, Sharman does not expressly
disclose "wherein said vocabulary of textual unit comprises . . . prefixes and suffixes each
having a pre-recorded speech sample associated therewith". However, the feature of a prefix or
suffix being an entry treated in the same way as a word in a dictionary is well known in the art as
evidenced by DIC, which teaches that prefixes and suffixes can be individually treated as entries,
just like word entries in a dictionary (see DIC: entries "a-" and "-ability"). Therefore, it would
have been obvious to one of ordinary skill in the art at the time the invention was made to
modify Sharman in view of Hata for specifically providing a mechanism to treat a prefix or
suffix in the same way as a word in a dictionary, as taught by DIC, so that each stored speech
sample can correspond to one of a word, prefix and suffix, for the purpose of selecting the
appropriate "granularity" or dictionary entry size to suit the specific application (Hata: column
3, lines 66-67).

Regarding claim 9, it recites an apparatus, which corresponds to the method of
claim 1. The rejection is based on the same reason as described for claim 1, because claim 9
recites the same or similar limitation(s) as claim 1.

Regarding claim 10, it recites an apparatus having a processor (see preamble) that
corresponds to Sharman's disclosure 'the TTS system includes two microprocessors'
(column 3, line 17). For the rest of the limitations, the rejection is based on the same reason as
described for claim 1, because claim 10 recites the same or similar limitation(s) as claim 1.

Regarding claim 11, it recites a computer readable medium for providing program
control to a processor included in a text to speech converter (see preamble) that reads on
Sharman's disclosure that an arrangement is particularly suitable for a workstation (equivalent to
computer) equipped with an adapter card with its own DSP (equivalent to processor) (column 3,
line 21). For the rest of the limitations, the rejection is based on the same reason as described for
claim 1, because claim 11 recites the same or similar limitation(s) as claim 1.

Regarding claim 12, it recites an apparatus. The rejection is based on the same reason as 
described for claims 1 and 6, because claim 12 recites the same or similar limitation(s) as claims 
1 and 6.

7. Claims 2-3 and 21-23 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Sharman in view of Hata and DIC as applied to claims 1 and 12 above, and further in view of Oh
(US 6,141,642).

Regarding claim 2 (depending on claim 1), Sharman in view of Hata and DIC further 
discloses: 

"when a one of said textual units in said list is indicated as not having an associated 
speech sample in memory", "passing said indicated textual unit", (column 5, lines 24-26, 'it is 
useful to include some back-up mechanism to be able to process (pass) words that are not in the 
dictionary'); and 

"appending said converted speech sample to said output signal" (as applied in claim 6). 

But Sharman in view of Hata and DIC does not expressly disclose "passing said
indicated textual unit to a secondary text to speech engine; receiving a speech sample converted
from said indicated textual unit from said secondary text to speech engine". However, this
feature is well known in the art as evidenced by Oh, who discloses a text-to-speech apparatus and
method for processing multiple languages (title), comprising a plurality of text-to-speech engines
for converting the sub-texts into audio wave data (speech sample) (column 1, line 65 to column 2,
line 5), and illustrates a structure (Fig. 2) having two TTS engines, wherein when a character
(text unit) of another language is detected the control is transferred to the other TTS engine
(secondary TTS), including lexical analysis, parsing, converting the input (received) text
(column 4, lines 23-53, and column 5, lines 1-10). Therefore, it would have been obvious to one
of ordinary skill in the art at the time the invention was made to modify Sharman in view of Hata
and DIC by specifically providing a secondary TTS for further processing the unmatched text,
for the purpose of generating appropriate sound for a multiple language text (Oh: column 1,
lines 57-58).

Regarding claim 3 (depending on claim 2), Sharman in view of Hata, DIC and Oh
discloses that said secondary text-to-speech engine comprises a phonetic text-to-speech engine
based on a voice talent, (Hata: Fig. 1 and column 3, lines 42-45, 'the reading system has a
dictionary of sampled sounds 40'; column 3, lines 26-31, 'the individual speech samples (equivalent
to voice talent) each represent discrete units of speech, such as phonemes or words').

Regarding claim 21, it recites a method; the rejection is based on the same reason 
described for claims 1-2 and 6, because the claim recites the same or similar limitation(s) as 
combined claims 1-2 and 6. 

Regarding claim 22 (depending on claim 21), the rejection is based on the same reason as
described for claim 6, because the claim recites the same or similar limitation(s) as claim 6.

Regarding claim 23, it recites an apparatus, which corresponds to a combination of
method claims 1, 2 and 6; the rejection is based on the same reason as described for claims 1, 2
and 6, because the rejection for claims 1, 2 and 6 covers the same or similar limitation(s) of
claim 23.



8. Claim 7 is rejected under 35 U.S.C. 103(a) as being unpatentable over Sharman in view
of Hata and DIC as applied to claim 6 above, and further in view of Microsoft Press ("Computer
Dictionary", page 298), hereinafter referenced as Rl.

Regarding claim 7 (depending on claim 6), Sharman particularly discloses that apart
from using a dictionary look-up, 'it is useful to include some back-up mechanism to be able to
process words that are not in the dictionary' (column 5, lines 24-26), which corresponds to
the claimed "if said one of said parsed textual units does not correspond to one of said stored
textual units" and "as being out of vocabulary." Sharman further recites that 'the output unit
represents the size of the text unit (e.g. word, sentence, phoneme); for many stages this is
accompanied by additional information for that unit (e.g., duration, part of speech etc.)' (column
6, line 59 to column 7, line 2), which means that the text unit may be different in each of the
processing stages. But Sharman in view of Hata and DIC does not expressly disclose marking a
text unit that does not match one either in the dictionary or by rule sets. However, this feature of
marking text unit data was well known in the art as evidenced by Rl, which is a popular
computer dictionary that gives the common meaning and explanation of words or phrases in the
computer related arts. Rl further discloses that one of the common meanings of the word
"mark" is "in applications and data storage, a symbol or other device used to distinguish one
item from others like it" (page 298, entry "mark"), so that when using "mark" as a verb, it can be
interpreted as an action to mark a symbol for certain data in a data storage, such as used for a
"text unit" for distinguishing the data from other data. Therefore, it would have been obvious to
one of ordinary skill in the art at the time the invention was made to modify Sharman by
specifically marking a text unit of the processed data, as taught by Rl, for the purpose of
distinguishing the text unit that is not in the dictionary and preparing for further processing
stages, such as processing in a back-up mechanism, generating phonemes, coping with prosodic
information (Sharman, column 5, lines 25-26, column 5, lines 30-56 and column 5, line 26). In
addition, there must inherently exist some mechanism to distinguish a word that is not in the
dictionary from other words that are in the dictionary in the Sharman system, because Sharman
suggests using a dictionary lookup and some back-up mechanism for handling the two different
situations (column 5, lines 23-25).

9. Claim 8 is rejected under 35 U.S.C. 103(a) as being unpatentable over Sharman in view 
of Hata, DIC and Rl as applied to claim 7 above, and further in view of O'Donnell 
("programming for the world--a guide to internationalization", ISBN 0-13-722190-8). 

Regarding claim 8 (depending on claim 7), Sharman in view of Hata, DIC and Rl does
not expressly disclose that "said marking comprises pre-pending a character to said textual unit."
However, the feature of marking a text unit by using a pre-pending character was well known, as
taught by O'Donnell, who wrote the book 'programming for the world', and discloses
appending a character symbol "$" to a digit string for distinguishing a monetary amount from a
normal number (page 49, table 2.1 1). Therefore, it would have been obvious to one of ordinary
skill in the art at the time the invention was made to modify Sharman and Rl by specifically
marking a text unit of the processed data by adding a character, such as "$" or the like, in front of
the text units, as taught by O'Donnell, for the purpose of easily distinguishing the text units and
preparing for further processing.
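
The marking discussed for claims 7 and 8 (distinguishing an out-of-vocabulary text unit by
placing a character in front of it) can be illustrated by the following minimal sketch. It is an
assumption-laden illustration only: the marker "$" is borrowed from the O'Donnell example, while
the names OOV_MARK and mark_out_of_vocabulary and the sample vocabulary are hypothetical and
appear in none of the cited references.

```python
from typing import Iterable, List, Set

# Hypothetical marker character; "$" follows the O'Donnell example cited above.
OOV_MARK = "$"


def mark_out_of_vocabulary(
    units: Iterable[str], vocabulary: Set[str], mark: str = OOV_MARK
) -> List[str]:
    """Return the units in order, with the mark placed in front of any unit
    that has no matching entry in the vocabulary."""
    return [unit if unit in vocabulary else mark + unit for unit in units]
```

A later processing stage, such as a back-up mechanism, could then select the marked units by
testing whether a unit starts with the marker.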

10. Claims 14-15 and 19-20 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Sharman in view of Hata, DIC and Malsheen et al. (US 4,979,216) hereinafter referenced as 
Malsheen. 

Regarding claim 14, Sharman discloses a text to speech system, comprising:
"a field for a textual unit", (column 7, lines 59-57 and Table 1, 'output buffer is also used
when a component produces several output units for each input unit that receives'; column 6, lines
59-67, 'the output unit represents the size of the text unit (e.g. word, phoneme)' in several
different process stages; which necessarily includes a data structure and a field for handling the
text unit),

"a field for speech sample associated with said textual unit", (Fig. 4 and column 6, lines
25-40, 'many samples (speech samples) of each phoneme are collected... for use in the diphone
library', 'relevant diphones (associated with the text units) ...are concatenated together by
diphone concatenation unit 415 (PSOLA)', which necessarily includes a data structure and a field
for handling the synthesized speech),

"wherein said textual units is one of a word, prefix and suffix" (column 2, lines 1-2, 'a
linguistic processor for generating a listing (list) of speech segments (textual units) . . . from the
input text'; column 5, lines 18-27, 'removing (separate) any possible prefix or suffix (treated as
a textual unit) to see if the word is related to one that is already in the dictionary'), and

"wherein a processor is capable of using the data structure to locate said associated
speech sample associated with said textual unit from a memory comprising a vocabulary of
words, prefixes and suffixes and a plurality of speech samples, and to use said associated speech
sample to produce an output signal", (column 3, lines 17-18, 'the TTS system includes two
microprocessors'; column 5, lines 'using a dictionary look-up (necessarily stored in memory)',
'breaking words down into syllables' and 'removing any possible prefix and suffix (also
necessarily stored in a buffer or memory in order to produce output speech)'; column 6, lines 25-
67, 'many samples (associated speech sample) of each diphone are collected'; Table 1 and
column 7, lines 54-67, 'the output buffer (memory) is used when a component produces several
output units for each input unit that receives', including 'token', 'word', 'phoneme', 'syllable';
column 6, lines 25-67, 'concatenated together (appending) by the diphone (associated sample)
concatenation unit 415 (PSOLA)', and 'produces the acoustic waveform (output signal)').

Even though Sharman discloses using a dictionary look-up for words, breaking words down
into syllables and removing a possible prefix and suffix as stated above, Sharman does not
expressly disclose "each speech sample corresponding to a one of said words, . . .in said
vocabulary". However, this feature is well known in the art as evidenced by Hata, who discloses a
high quality concatenative reading system (title), and teaches "a dictionary of sampled sounds
recorded and stored in advance", "the dictionary (including vocabulary) entries can be
individual words", "the dictionary of samples may store more elemental speech components,
such as individual phonemes", which means the system is capable of using larger units (such as
a word) or smaller units (such as a phoneme) of speech sample, and that "whether to store entire
words or individual phonemes is largely a system design issue" (column 3, lines 42-65).
Therefore, it would have been obvious to one of ordinary skill in the art at the time the invention
was made to modify Sharman for specifically providing a stored speech sample corresponding to a
word or other unit, as taught by Hata, for the purpose of selecting the appropriate "granularity"
or dictionary entry size to suit the specific application (Hata: column 3, lines 66-67).

Even though Sharman in view of Hata discloses a stored speech sample corresponding to
a word (or other unit) in a vocabulary (dictionary), as stated above, Sharman does not expressly
disclose "each speech sample corresponding to a one of said . . . , prefixes and suffixes in said
vocabulary". However, the feature of a prefix or suffix being an entry treated in the same way as a
word in a dictionary is well known in the art as evidenced by DIC, which teaches that prefixes and
suffixes can be individually treated as entries, just like word entries in a dictionary (see DIC:
entries "a-" and "-ability"). Therefore, it would have been obvious to one of ordinary skill in the
art at the time the invention was made to modify Sharman in view of Hata for specifically
providing a mechanism to treat a prefix or suffix in the same way as a word in a dictionary, as
taught by DIC, so that each stored speech sample can correspond to one of a word, prefix and
suffix, for the purpose of selecting the appropriate "granularity" or dictionary entry size to suit
the specific application (Hata: column 3, lines 66-67).

Further, Sharman in view of Hata and DIC does not expressly disclose the data structure
having "a field for a frequency of a first portion of the speech sample that exceeds an amplitude
threshold, and a field for a frequency of a last portion of the speech sample that exceeds an
amplitude threshold," which can be broadly interpreted as a data structure feature having simple
data fields for storing frequency or duration related speech information, since this limitation
does not specifically define any type of the data in the data structure design, nor describe any
relationship with other data fields or incorporation with other system elements. However, this
feature is well known in the art as evidenced by Malsheen, who discloses data structures for
storing single phoneme enunciations (column 5, line 65 through column 6, line 26), and having
multiple frequency and time (duration) fields (Tables 1-4). As best understood in view of the
specification (page 9, paragraph 3 and page 10, paragraph 2), the field for a frequency of a first
(or last) portion of the speech sample that exceeds an amplitude threshold can be interpreted as
zero crossing data, which is inherently related to frequency or duration information about pitch
that can be equivalently expressed in frequency, so that Malsheen's disclosed data structure having
multiple frequency or time (duration) fields can be used for implementing the two claimed data
fields. Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Sharman in view of Hata and DIC by specifically providing data
structures having multiple fields for frequency or time (duration) information for processing and
storing speech data, as taught by Malsheen, for the purpose of reducing cost (Malsheen: column
2, line 57).

In addition, in a broader view, a data structure is a template that data can be applied to.
For computer and/or microprocessor based devices, a data structure is inherent to storing and
accessing the required data through associated hardware and/or software functionalities.
The claimed data structure includes two general fields for use without any specific data type
(such as text, number, length) and without any connection to other software and hardware, so that,
in fact, any two data elements relating to frequency or duration information can apply to the two
fields of the template; thus Sharman, Hata, and Malsheen may, either individually or in
combination, satisfy the limitation of these two fields.
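
The data structure recited in claim 14, with the frequency fields read as zero crossing data as
interpreted above, can be illustrated by the following minimal sketch. It rests on stated
assumptions: frequency is estimated as half the zero-crossing rate, and the first and last
`portion` samples stand in for the portions selected by the amplitude threshold, which is omitted
here. The names SpeechEntry, zero_crossing_frequency and make_entry are hypothetical and are not
taken from Sharman, Hata, Malsheen or the application.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SpeechEntry:
    """Hypothetical record with the four fields discussed for claim 14."""
    textual_unit: str               # word, prefix or suffix
    speech_sample: List[float]      # stored samples for the unit
    first_portion_frequency: float  # Hz, estimated over the first portion
    last_portion_frequency: float   # Hz, estimated over the last portion


def zero_crossing_frequency(samples: Sequence[float], sample_rate: int) -> float:
    """Estimate frequency in Hz as half the number of sign changes per second."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = (len(samples) - 1) / sample_rate
    return crossings / (2.0 * duration)


def make_entry(
    unit: str, samples: Sequence[float], sample_rate: int, portion: int
) -> SpeechEntry:
    """Build an entry; the amplitude-threshold selection is omitted (assumption)."""
    return SpeechEntry(
        textual_unit=unit,
        speech_sample=list(samples),
        first_portion_frequency=zero_crossing_frequency(samples[:portion], sample_rate),
        last_portion_frequency=zero_crossing_frequency(samples[-portion:], sample_rate),
    )
```

Any two frequency or duration values could populate the last two fields, which is the sense in
which the fields are read broadly above.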

Regarding claim 15 (depending on claim 14), the claim only adds two more fields, which
is interpreted as the template with a few more fields that any data can be applied to as stated
above (claim 14, last paragraph), so that Sharman and Hata and Malsheen can, either individually
or in combination, satisfy the claimed limitation(s). In addition, Sharman in view of Hata, DIC and
Malsheen further discloses a phonological feature table (an array type of data structure) 52 (Hata:
Fig. 3), comprising fields of phonemes that a word may begin and end with (Hata: column 5,
lines 14-31, and column 7, lines 55-59), which further corresponds to the claimed "a field for a
phoneme that said textual unit starts with, and a field for a phoneme that the textual unit ends
with."

Regarding claims 19 and 20 (depending on claim 14), the rejection is based on the same or
similar reason described in claim 14, because these claims only add three more fields, which is
interpreted as the template with a few more fields that any data can be applied to; therefore
Sharman and Hata and Malsheen can, either individually or in combination, satisfy the claimed
limitation(s).

11. Claims 16 and 18 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Sharman in view of Hata, DIC and Rl as applied to claims 7 and 12 above, and further in view
of Oh.

Regarding claim 16 (depending on claim 7), the rejection is based on the same reason as 
described for claim 2, because the claim recites the same or similar limitation(s) as claim 2. 

Regarding claim 18 (depending on claim 12), which corresponds to a combination of 
claims 2 and 7; the rejection is based on the same reason as described for claims 2 and 7, because 
the claim recites the same or similar limitation(s) as claims 2 and 7. 

12. Claim 17 is rejected under 35 U.S.C. 103(a) as being unpatentable over Sharman in view 
of Hata, DIC, Rl and O'Donnell as applied to claim 8 above, and further in view of Oh. 

Regarding claim 17 (depending on claim 8), the rejection is based on the same reason as 
described for claim 2, because the claim recites the same or similar limitation(s) as claim 2.

Conclusion

13. Applicant's amendment necessitated the new ground(s) of rejection presented in this
Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a).
Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A
shortened statutory period for reply to this final action is set to expire THREE MONTHS from
the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the
mailing date of this final action and the advisory action is not mailed until after the end of the
THREE-MONTH shortened statutory period, then the shortened statutory period will expire on
the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be
calculated from the mailing date of the advisory action. In no event, however, will the statutory
period for reply expire later than SIX MONTHS from the date of this final action.

14. Please address mail to be delivered by the United States Postal Service (USPS) as 
follows: 

Mail Stop 

Commissioner for Patents 

P.O. Box 1450 

Alexandria, VA 22313-1450 
or faxed to: 571-273-8300, (for formal communications intended for entry) 
Or: 571-273-8300, (for informal or draft communications, and please label 
"PROPOSED" or "DRAFT") 

If no Mail Stop is indicated below, the line beginning Mail Stop should be omitted from 
the address. 

Effective January 14, 2005, except correspondence for Maintenance Fee payments, 
Deposit Account Replenishments (see 1.25(c)(4)), and Licensing and Review (see 37 CFR 5.1(c) 
and 5.2(c)), please address correspondence to be delivered by other delivery services (Federal 
Express (Fed Ex), UPS, DHL, Laser, Action, Purolator, etc.) as follows:

U.S. Patent and Trademark Office 

Customer Window, Mail Stop 

Randolph Building 

Alexandria, VA 22314

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Qi Han whose telephone number is (571) 272-7604. The
examiner can normally be reached on Monday through Thursday from 9:00 a.m. to 7:00 p.m. If
attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor,
Richemond Dorvil, can be reached on (571) 272-7602.

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Inquiries regarding the status of submissions 
relating to an application or questions on the Private PAIR system should be directed to the 
Electronic Business Center (EBC) at 866-217-9197 (toll-free) or 703-305-3028 between the 
hours of 6 a.m. and midnight Monday through Friday EST, or by e-mail at: ebc@uspto.gov. For 
general information about the PAIR system, see http://pair-direct.uspto.gov. 



QH/qh 

June 29, 2005